Overview

Dataset statistics

Number of variables: 22
Number of observations: 899
Missing cells: 971
Missing cells (%): 4.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 154.6 KiB
Average record size in memory: 176.1 B
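
The two derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming the percentage of missing cells is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 22
n_observations = 899
missing_cells = 971
total_size_kib = 154.6

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (4.9% and 176.1 B).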

Variable types

Numeric: 10
Categorical: 12

Alerts

trestbps is highly correlated with trestbpd [High correlation]
trestbpd is highly correlated with trestbps and 1 other field [High correlation]
df_index is highly correlated with restecg and 1 other field [High correlation]
cp is highly correlated with exang [High correlation]
restecg is highly correlated with df_index [High correlation]
prop is highly correlated with nitr [High correlation]
nitr is highly correlated with prop and 2 other fields [High correlation]
pro is highly correlated with nitr and 1 other field [High correlation]
thaldur is highly correlated with thalach [High correlation]
thalach is highly correlated with thaldur and 2 other fields [High correlation]
thalrest is highly correlated with thalach [High correlation]
tpeakbps is highly correlated with xhypo [High correlation]
tpeakbpd is highly correlated with trestbpd [High correlation]
exang is highly correlated with cp and 2 other fields [High correlation]
xhypo is highly correlated with tpeakbps [High correlation]
oldpeak is highly correlated with exang and 1 other field [High correlation]
num is highly correlated with oldpeak [High correlation]
dataset is highly correlated with df_index and 2 other fields [High correlation]
trestbps has 61 (6.8%) missing values [Missing]
dig has 70 (7.8%) missing values [Missing]
prop has 68 (7.6%) missing values [Missing]
nitr has 67 (7.5%) missing values [Missing]
pro has 65 (7.2%) missing values [Missing]
diuretic has 83 (9.2%) missing values [Missing]
thaldur has 58 (6.5%) missing values [Missing]
thalach has 57 (6.3%) missing values [Missing]
thalrest has 58 (6.5%) missing values [Missing]
tpeakbps has 65 (7.2%) missing values [Missing]
tpeakbpd has 65 (7.2%) missing values [Missing]
trestbpd has 61 (6.8%) missing values [Missing]
exang has 57 (6.3%) missing values [Missing]
xhypo has 60 (6.7%) missing values [Missing]
oldpeak has 64 (7.1%) missing values [Missing]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
oldpeak has 361 (40.2%) zeros [Zeros]
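
The missing-value alerts can be reconciled with the dataset-level count of 971 missing cells. The sketch below uses the per-variable Missing counts listed under Variables. The 5% cut-off is a hypothetical value chosen only to separate the flagged variables from the rest in this report; it is not taken from the tool's configuration.

```python
# Missing-value counts per variable, copied from the Variables section.
n_observations = 899
missing = {
    "df_index": 0, "age": 2, "sex": 2, "cp": 2, "trestbps": 61,
    "restecg": 4, "dig": 70, "prop": 68, "nitr": 67, "pro": 65,
    "diuretic": 83, "thaldur": 58, "thalach": 57, "thalrest": 58,
    "tpeakbps": 65, "tpeakbpd": 65, "trestbpd": 61, "exang": 57,
    "xhypo": 60, "oldpeak": 64, "num": 2, "dataset": 0,
}

total_missing = sum(missing.values())
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing.items()
}

# Hypothetical cut-off, used here only for illustration.
ASSUMED_CUTOFF_PCT = 5.0
flagged = [name for name, pct in missing_pct.items() if pct >= ASSUMED_CUTOFF_PCT]

print(total_missing)
print(len(flagged), flagged)
```

With these counts the total comes to 971, and the 15 variables above the assumed cut-off are the same 15 that carry a Missing alert.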

Reproduction

Analysis started: 2022-10-19 18:23:19.341940
Analysis finished: 2022-10-19 18:23:28.461478
Duration: 9.12 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
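
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2022-10-19 18:23:19.341940")
finished = datetime.fromisoformat("2022-10-19 18:23:28.461478")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```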

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 899
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 449.192436
Minimum: 0
Maximum: 900
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 44.9
Q1: 224.5
Median: 449
Q3: 673.5
95-th percentile: 854.1
Maximum: 900
Range: 900
Interquartile range (IQR): 449

Descriptive statistics

Standard deviation: 259.9377296
Coefficient of variation (CV): 0.5786778867
Kurtosis: -1.198555665
Mean: 449.192436
Median Absolute Deviation (MAD): 225
Skewness: 0.002376309891
Sum: 403824
Variance: 67567.62328
Monotonicity: Strictly increasing
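
Several of the df_index statistics are simple functions of the others. The sketch below recomputes them from the reported figures, assuming the usual definitions (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1).

```python
# Figures copied from the df_index statistics.
count = 899            # no missing values for df_index
total = 403824         # Sum
std = 259.9377296      # Standard deviation
minimum, maximum = 0, 900
q1, q3 = 224.5, 673.5

mean = total / count
cv = std / mean
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.6f} cv={cv:.10f} variance={variance:.5f}")
print(f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the report up to the rounding of the printed standard deviation.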
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   1      0.1%
674                 1      0.1%
592                 1      0.1%
593                 1      0.1%
594                 1      0.1%
595                 1      0.1%
596                 1      0.1%
597                 1      0.1%
598                 1      0.1%
599                 1      0.1%
Other values (889)  889    98.9%

Minimum 10 values
Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%

Maximum 10 values
Value  Count  Frequency (%)
900    1      0.1%
899    1      0.1%
898    1      0.1%
897    1      0.1%
896    1      0.1%
895    1      0.1%
894    1      0.1%
892    1      0.1%
891    1      0.1%
890    1      0.1%

age
Real number (ℝ≥0)

Distinct: 50
Distinct (%): 5.6%
Missing: 2
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.46934225
Minimum: 28
Maximum: 77
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 28
5-th percentile: 37
Q1: 47
Median: 54
Q3: 60
95-th percentile: 68
Maximum: 77
Range: 49
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.441987279
Coefficient of variation (CV): 0.1765869353
Kurtosis: -0.3832049976
Mean: 53.46934225
Median Absolute Deviation (MAD): 7
Skewness: -0.1805064406
Sum: 47962
Variance: 89.15112379
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
54                 51     5.7%
58                 40     4.4%
55                 38     4.2%
52                 36     4.0%
56                 36     4.0%
57                 35     3.9%
51                 35     3.9%
62                 34     3.8%
59                 34     3.8%
53                 33     3.7%
Other values (40)  525    58.4%

Minimum 10 values
Value  Count  Frequency (%)
28     1      0.1%
29     3      0.3%
30     1      0.1%
31     2      0.2%
32     5      0.6%
33     2      0.2%
34     7      0.8%
35     10     1.1%
36     6      0.7%
37     11     1.2%

Maximum 10 values
Value  Count  Frequency (%)
77     2      0.2%
76     2      0.2%
75     3      0.3%
74     7      0.8%
73     1      0.1%
72     4      0.4%
71     5      0.6%
70     7      0.8%
69     13     1.4%
68     9      1.0%

sex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 2
Missing (%): 0.2%
Memory size: 7.1 KiB

Value  Count
1.0    709
0.0    188

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2691
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 1.0
4th row: 0.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
1.0        709    78.9%
0.0        188    20.9%
(Missing)  2      0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    709    79.0%
0.0    188    21.0%
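
The Common Values table and the Category Frequency Plot table report slightly different percentages for the same counts. The figures are consistent with two different denominators: all 899 observations in the first case, and only the 897 non-missing observations in the second. This is an inference from the numbers, not something the report states. A minimal check:

```python
# Counts copied from the sex variable.
n_observations = 899
counts = {"1.0": 709, "0.0": 188}
n_present = sum(counts.values())  # observations with a non-missing value

# Share of all rows (matches Common Values).
share_of_all = {v: round(100 * c / n_observations, 1) for v, c in counts.items()}
# Share of non-missing rows (matches the Category Frequency Plot table).
share_of_present = {v: round(100 * c / n_present, 1) for v, c in counts.items()}

print(share_of_all)
print(share_of_present)
```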

Most occurring characters

Value  Count  Frequency (%)
0      1085   40.3%
.      897    33.3%
1      709    26.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1794   66.7%
Other Punctuation  897    33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1085   60.5%
1      709    39.5%
Other Punctuation
Value  Count  Frequency (%)
.      897    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2691   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1085   40.3%
.      897    33.3%
1      709    26.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2691   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      1085   40.3%
.      897    33.3%
1      709    26.3%
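
The character-level tables treat each value of sex as the three-character string "1.0" or "0.0". The sketch below rebuilds the character and Unicode-category counts from the value counts using the standard library. It illustrates the idea and is not necessarily how the profiling tool computes them. `unicodedata.category` returns the short codes `Nd` (Decimal Number) and `Po` (Other Punctuation).

```python
import unicodedata
from collections import Counter

# Rebuild the non-missing values of sex from the reported counts.
values = ["1.0"] * 709 + ["0.0"] * 188
text = "".join(values)

total_characters = len(text)
char_counts = Counter(text)
category_counts = Counter(unicodedata.category(ch) for ch in text)

for ch, n in char_counts.most_common():
    print(ch, n, f"{100 * n / total_characters:.1f}%")
print(dict(category_counts))
```

The character "0" appears once in every "1.0" and twice in every "0.0", which gives 709 + 2 * 188 = 1085.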

cp
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.4%
Missing2
Missing (%)0.2%
Memory size7.1 KiB
4.0
484 
3.0
201 
2.0
167 
1.0
 
45

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2691
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row3.0
3rd row2.0
4th row4.0
5th row3.0

Common Values

ValueCountFrequency (%)
4.0484
53.8%
3.0201
22.4%
2.0167
 
18.6%
1.045
 
5.0%
(Missing)2
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
4.0484
54.0%
3.0201
22.4%
2.0167
 
18.6%
1.045
 
5.0%

Most occurring characters

ValueCountFrequency (%)
.897
33.3%
0897
33.3%
4484
18.0%
3201
 
7.5%
2167
 
6.2%
145
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1794
66.7%
Other Punctuation897
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0897
50.0%
4484
27.0%
3201
 
11.2%
2167
 
9.3%
145
 
2.5%
Other Punctuation
ValueCountFrequency (%)
.897
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2691
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.897
33.3%
0897
33.3%
4484
18.0%
3201
 
7.5%
2167
 
6.2%
145
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII2691
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.897
33.3%
0897
33.3%
4484
18.0%
3201
 
7.5%
2167
 
6.2%
145
 
1.7%

trestbps
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct59
Distinct (%)7.0%
Missing61
Missing (%)6.8%
Infinite0
Infinite (%)0.0%
Mean132.2279236
Minimum80
Maximum200
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum80
5-th percentile105
Q1120
median130
Q3140
95-th percentile160
Maximum200
Range120
Interquartile range (IQR)20

Descriptive statistics

Standard deviation18.6004158
Coefficient of variation (CV)0.140669348
Kurtosis0.6414193952
Mean132.2279236
Median Absolute Deviation (MAD)10
Skewness0.632380023
Sum110807
Variance345.9754678
MonotonicityNot monotonic
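
For a variable with missing values, the reported figures suggest that Mean and Distinct (%) are taken over the non-missing observations, while Missing (%) is taken over all observations. This is inferred from the numbers in this report. A short check for trestbps:

```python
# Figures copied from the trestbps statistics.
n_observations = 899
n_missing = 61
n_distinct = 59
value_sum = 110807

n_present = n_observations - n_missing

mean = value_sum / n_present                      # over non-missing rows
distinct_pct = 100 * n_distinct / n_present       # over non-missing rows
missing_pct = 100 * n_missing / n_observations    # over all rows

print(f"mean={mean:.7f} distinct={distinct_pct:.1f}% missing={missing_pct:.1f}%")
```

Dividing the distinct count by all 899 rows instead would give 6.6%, not the 7.0% shown.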
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
120128
14.2%
130112
12.5%
140100
 
11.1%
11058
 
6.5%
15056
 
6.2%
16050
 
5.6%
12528
 
3.1%
11519
 
2.1%
13518
 
2.0%
12816
 
1.8%
Other values (49)253
28.1%
(Missing)61
 
6.8%
ValueCountFrequency (%)
801
 
0.1%
921
 
0.1%
942
 
0.2%
956
 
0.7%
961
 
0.1%
981
 
0.1%
10015
1.7%
1011
 
0.1%
1023
 
0.3%
1043
 
0.3%
ValueCountFrequency (%)
2004
 
0.4%
1921
 
0.1%
1902
 
0.2%
1851
 
0.1%
18012
1.3%
1783
 
0.3%
1741
 
0.1%
1722
 
0.2%
17013
1.4%
1652
 
0.2%

restecg
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.3%
Missing4
Missing (%)0.4%
Memory size7.1 KiB
0.0
537 
2.0
182 
1.0
176 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2685
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0537
59.7%
2.0182
 
20.2%
1.0176
 
19.6%
(Missing)4
 
0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0537
60.0%
2.0182
 
20.3%
1.0176
 
19.7%

Most occurring characters

ValueCountFrequency (%)
01432
53.3%
.895
33.3%
2182
 
6.8%
1176
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1790
66.7%
Other Punctuation895
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01432
80.0%
2182
 
10.2%
1176
 
9.8%
Other Punctuation
ValueCountFrequency (%)
.895
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2685
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01432
53.3%
.895
33.3%
2182
 
6.8%
1176
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII2685
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01432
53.3%
.895
33.3%
2182
 
6.8%
1176
 
6.6%

dig
Categorical

MISSING

Distinct2
Distinct (%)0.2%
Missing70
Missing (%)7.8%
Memory size7.1 KiB
0.0
800 
1.0
 
29

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2487
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0800
89.0%
1.029
 
3.2%
(Missing)70
 
7.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0800
96.5%
1.029
 
3.5%

Most occurring characters

ValueCountFrequency (%)
01629
65.5%
.829
33.3%
129
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1658
66.7%
Other Punctuation829
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01629
98.3%
129
 
1.7%
Other Punctuation
ValueCountFrequency (%)
.829
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2487
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01629
65.5%
.829
33.3%
129
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII2487
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01629
65.5%
.829
33.3%
129
 
1.2%

prop
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)0.2%
Missing68
Missing (%)7.6%
Memory size7.1 KiB
0.0
617 
1.0
214 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2493
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0617
68.6%
1.0214
 
23.8%
(Missing)68
 
7.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0617
74.2%
1.0214
 
25.8%

Most occurring characters

ValueCountFrequency (%)
01448
58.1%
.831
33.3%
1214
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1662
66.7%
Other Punctuation831
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01448
87.1%
1214
 
12.9%
Other Punctuation
ValueCountFrequency (%)
.831
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2493
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01448
58.1%
.831
33.3%
1214
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII2493
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01448
58.1%
.831
33.3%
1214
 
8.6%

nitr
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)0.2%
Missing67
Missing (%)7.5%
Memory size7.1 KiB
0.0
611 
1.0
221 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2496
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0611
68.0%
1.0221
 
24.6%
(Missing)67
 
7.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0611
73.4%
1.0221
 
26.6%

Most occurring characters

ValueCountFrequency (%)
01443
57.8%
.832
33.3%
1221
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1664
66.7%
Other Punctuation832
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01443
86.7%
1221
 
13.3%
Other Punctuation
ValueCountFrequency (%)
.832
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2496
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01443
57.8%
.832
33.3%
1221
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII2496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01443
57.8%
.832
33.3%
1221
 
8.9%

pro
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)0.2%
Missing65
Missing (%)7.2%
Memory size7.1 KiB
0.0
690 
1.0
144 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2502
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0690
76.8%
1.0144
 
16.0%
(Missing)65
 
7.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0690
82.7%
1.0144
 
17.3%

Most occurring characters

ValueCountFrequency (%)
01524
60.9%
.834
33.3%
1144
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1668
66.7%
Other Punctuation834
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01524
91.4%
1144
 
8.6%
Other Punctuation
ValueCountFrequency (%)
.834
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2502
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01524
60.9%
.834
33.3%
1144
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII2502
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01524
60.9%
.834
33.3%
1144
 
5.8%

diuretic
Categorical

MISSING

Distinct2
Distinct (%)0.2%
Missing83
Missing (%)9.2%
Memory size7.1 KiB
0.0
725 
1.0
91 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2448
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0725
80.6%
1.091
 
10.1%
(Missing)83
 
9.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0725
88.8%
1.091
 
11.2%

Most occurring characters

ValueCountFrequency (%)
01541
62.9%
.816
33.3%
191
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1632
66.7%
Other Punctuation816
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01541
94.4%
191
 
5.6%
Other Punctuation
ValueCountFrequency (%)
.816
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2448
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01541
62.9%
.816
33.3%
191
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII2448
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01541
62.9%
.816
33.3%
191
 
3.7%

thaldur
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct86
Distinct (%)10.2%
Missing58
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean8.654458977
Minimum1
Maximum24
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum1
5-th percentile3.1
Q16
median8.1
Q310.5
95-th percentile16
Maximum24
Range23
Interquartile range (IQR)4.5

Descriptive statistics

Standard deviation3.750466761
Coefficient of variation (CV)0.433356582
Kurtosis0.869856633
Mean8.654458977
Median Absolute Deviation (MAD)2.1
Skewness0.8049512327
Sum7278.4
Variance14.06600093
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
993
 
10.3%
765
 
7.2%
660
 
6.7%
1051
 
5.7%
848
 
5.3%
1145
 
5.0%
1245
 
5.0%
439
 
4.3%
535
 
3.9%
1332
 
3.6%
Other values (76)328
36.5%
(Missing)58
 
6.5%
ValueCountFrequency (%)
11
 
0.1%
1.54
 
0.4%
1.71
 
0.1%
1.81
 
0.1%
211
1.2%
2.31
 
0.1%
2.51
 
0.1%
322
2.4%
3.12
 
0.2%
3.21
 
0.1%
ValueCountFrequency (%)
241
 
0.1%
211
 
0.1%
206
 
0.7%
1912
1.3%
1815
1.7%
175
 
0.6%
16.51
 
0.1%
166
 
0.7%
1511
1.2%
14.41
 
0.1%

thalach
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct119
Distinct (%)14.1%
Missing57
Missing (%)6.3%
Infinite0
Infinite (%)0.0%
Mean137.2767221
Minimum60
Maximum202
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum60
5-th percentile95
Q1120
median140
Q3157
95-th percentile178
Maximum202
Range142
Interquartile range (IQR)37

Descriptive statistics

Standard deviation25.98962849
Coefficient of variation (CV)0.1893229099
Kurtosis-0.4888487026
Mean137.2767221
Median Absolute Deviation (MAD)20
Skewness-0.1977123555
Sum115587
Variance675.4607892
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
15042
 
4.7%
14040
 
4.4%
12035
 
3.9%
13029
 
3.2%
16026
 
2.9%
11021
 
2.3%
12520
 
2.2%
17020
 
2.2%
12216
 
1.8%
10014
 
1.6%
Other values (109)579
64.4%
(Missing)57
 
6.3%
ValueCountFrequency (%)
601
0.1%
631
0.1%
671
0.1%
691
0.1%
701
0.1%
711
0.1%
722
0.2%
731
0.1%
771
0.1%
781
0.1%
ValueCountFrequency (%)
2021
 
0.1%
1951
 
0.1%
1941
 
0.1%
1921
 
0.1%
1902
0.2%
1882
0.2%
1871
 
0.1%
1862
0.2%
1854
0.4%
1844
0.4%

thalrest
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct75
Distinct (%)8.9%
Missing58
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean75.48394768
Minimum37
Maximum139
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum37
5-th percentile55
Q165
median74
Q384
95-th percentile100
Maximum139
Range102
Interquartile range (IQR)19

Descriptive statistics

Standard deviation14.73875816
Coefficient of variation (CV)0.1952568541
Kurtosis0.7372781955
Mean75.48394768
Median Absolute Deviation (MAD)10
Skewness0.6372528714
Sum63482
Variance217.230992
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
7041
 
4.6%
7433
 
3.7%
8030
 
3.3%
6829
 
3.2%
7527
 
3.0%
7226
 
2.9%
6426
 
2.9%
7325
 
2.8%
8425
 
2.8%
7824
 
2.7%
Other values (65)555
61.7%
(Missing)58
 
6.5%
ValueCountFrequency (%)
371
 
0.1%
391
 
0.1%
401
 
0.1%
431
 
0.1%
441
 
0.1%
462
0.2%
471
 
0.1%
494
0.4%
504
0.4%
511
 
0.1%
ValueCountFrequency (%)
1391
 
0.1%
1341
 
0.1%
1253
0.3%
1241
 
0.1%
1203
0.3%
1191
 
0.1%
1161
 
0.1%
1152
 
0.2%
1122
 
0.2%
1106
0.7%

tpeakbps
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct73
Distinct (%)8.8%
Missing65
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean171.6306954
Minimum84
Maximum240
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum84
5-th percentile130
Q1155
median170
Q3190
95-th percentile220
Maximum240
Range156
Interquartile range (IQR)35

Descriptive statistics

Standard deviation25.732959
Coefficient of variation (CV)0.149932149
Kurtosis0.1682096447
Mean171.6306954
Median Absolute Deviation (MAD)18
Skewness0.04011954318
Sum143140
Variance662.1851791
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
18097
 
10.8%
16095
 
10.6%
17081
 
9.0%
19066
 
7.3%
20058
 
6.5%
15046
 
5.1%
14040
 
4.4%
22024
 
2.7%
21021
 
2.3%
13018
 
2.0%
Other values (63)288
32.0%
(Missing)65
 
7.2%
ValueCountFrequency (%)
841
 
0.1%
901
 
0.1%
921
 
0.1%
982
 
0.2%
1001
 
0.1%
1104
0.4%
1121
 
0.1%
1151
 
0.1%
1161
 
0.1%
1209
1.0%
ValueCountFrequency (%)
2405
 
0.6%
2351
 
0.1%
2321
 
0.1%
23014
1.6%
2281
 
0.1%
2241
 
0.1%
22024
2.7%
2161
 
0.1%
2155
 
0.6%
21021
2.3%

tpeakbpd
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct51
Distinct (%)6.1%
Missing65
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean87.27697842
Minimum11
Maximum134
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum11
5-th percentile65
Q180
median88
Q3100
95-th percentile110
Maximum134
Range123
Interquartile range (IQR)20

Descriptive statistics

Standard deviation14.74729177
Coefficient of variation (CV)0.1689711541
Kurtosis0.9190349164
Mean87.27697842
Median Absolute Deviation (MAD)10
Skewness-0.1278148627
Sum72789
Variance217.4826146
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
80162
18.0%
90130
14.5%
100114
12.7%
7059
 
6.6%
11043
 
4.8%
9528
 
3.1%
8525
 
2.8%
6023
 
2.6%
7523
 
2.6%
7822
 
2.4%
Other values (41)205
22.8%
(Missing)65
 
7.2%
ValueCountFrequency (%)
111
 
0.1%
261
 
0.1%
402
 
0.2%
451
 
0.1%
502
 
0.2%
551
 
0.1%
562
 
0.2%
583
 
0.3%
6023
2.6%
623
 
0.3%
ValueCountFrequency (%)
1341
 
0.1%
1302
 
0.2%
12015
 
1.7%
1184
 
0.4%
1162
 
0.2%
1158
 
0.9%
1142
 
0.2%
1121
 
0.1%
11043
4.8%
1082
 
0.2%

trestbpd
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct33
Distinct (%)3.9%
Missing61
Missing (%)6.8%
Infinite0
Infinite (%)0.0%
Mean83.61575179
Minimum50
Maximum120
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 KiB

Quantile statistics

Minimum50
5-th percentile70
Q180
median80
Q390
95-th percentile100
Maximum120
Range70
Interquartile range (IQR)10

Descriptive statistics

Standard deviation9.847479168
Coefficient of variation (CV)0.1177706228
Kurtosis0.1751540253
Mean83.61575179
Median Absolute Deviation (MAD)8
Skewness0.09062921997
Sum70070
Variance96.97284597
MonotonicityNot monotonic
Histogram with fixed size bins (bins=33)
ValueCountFrequency (%)
80258
28.7%
90157
17.5%
7088
 
9.8%
10064
 
7.1%
8542
 
4.7%
7824
 
2.7%
9523
 
2.6%
7521
 
2.3%
8215
 
1.7%
8814
 
1.6%
Other values (23)132
14.7%
(Missing)61
 
6.8%
ValueCountFrequency (%)
502
 
0.2%
581
 
0.1%
6012
 
1.3%
644
 
0.4%
656
 
0.7%
661
 
0.1%
684
 
0.4%
7088
9.8%
728
 
0.9%
749
 
1.0%
ValueCountFrequency (%)
1201
 
0.1%
1107
 
0.8%
1062
 
0.2%
1055
 
0.6%
1041
 
0.1%
1021
 
0.1%
10064
7.1%
9812
 
1.3%
967
 
0.8%
9523
 
2.6%

exang
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)0.2%
Missing57
Missing (%)6.3%
Memory size7.1 KiB
0.0
513 
1.0
329 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2526
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0513
57.1%
1.0329
36.6%
(Missing)57
 
6.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0513
60.9%
1.0329
39.1%

Most occurring characters

ValueCountFrequency (%)
01355
53.6%
.842
33.3%
1329
 
13.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1684
66.7%
Other Punctuation842
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01355
80.5%
1329
 
19.5%
Other Punctuation
ValueCountFrequency (%)
.842
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2526
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01355
53.6%
.842
33.3%
1329
 
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2526
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01355
53.6%
.842
33.3%
1329
 
13.0%

xhypo
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)0.2%
Missing60
Missing (%)6.7%
Memory size7.1 KiB
0.0
817 
1.0
 
22

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2517
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0817
90.9%
1.022
 
2.4%
(Missing)60
 
6.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0817
97.4%
1.022
 
2.6%

Most occurring characters

ValueCountFrequency (%)
01656
65.8%
.839
33.3%
122
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1678
66.7%
Other Punctuation839
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01656
98.7%
122
 
1.3%
Other Punctuation
ValueCountFrequency (%)
.839
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2517
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01656
65.8%
.839
33.3%
122
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII2517
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01656
65.8%
.839
33.3%
122
 
0.9%

oldpeak
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct52
Distinct (%)6.2%
Missing64
Missing (%)7.1%
Infinite0
Infinite (%)0.0%
Mean0.8707784431
Minimum-2.6
Maximum6.2
Zeros361
Zeros (%)40.2%
Negative12
Negative (%)1.3%
Memory size7.1 KiB

Quantile statistics

Minimum-2.6
5-th percentile0
Q10
median0.5
Q31.5
95-th percentile3
Maximum6.2
Range8.8
Interquartile range (IQR)1.5

Descriptive statistics

Standard deviation1.081203585
Coefficient of variation (CV)1.241651758
Kurtosis1.145810293
Mean0.8707784431
Median Absolute Deviation (MAD)0.5
Skewness1.027919483
Sum727.1
Variance1.169001192
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0361
40.2%
182
 
9.1%
275
 
8.3%
1.547
 
5.2%
328
 
3.1%
0.519
 
2.1%
2.516
 
1.8%
1.415
 
1.7%
1.214
 
1.6%
1.614
 
1.6%
Other values (42)164
18.2%
(Missing)64
 
7.1%
ValueCountFrequency (%)
-2.61
0.1%
-21
0.1%
-1.51
0.1%
-1.11
0.1%
-12
0.2%
-0.91
0.1%
-0.81
0.1%
-0.71
0.1%
-0.52
0.2%
-0.11
0.1%
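
The ten smallest oldpeak values listed above are all negative, and their counts add up to the reported Negative count of 12, so this table happens to enumerate every negative observation. The sketch below tallies them and recomputes the Negative (%) and Zeros (%) figures, assuming both percentages are taken over all 899 observations.

```python
# (value, count) pairs copied from the table of smallest oldpeak values.
smallest_values = [
    (-2.6, 1), (-2.0, 1), (-1.5, 1), (-1.1, 1), (-1.0, 2),
    (-0.9, 1), (-0.8, 1), (-0.7, 1), (-0.5, 2), (-0.1, 1),
]
n_observations = 899
n_zeros = 361

negative_count = sum(count for value, count in smallest_values if value < 0)
negative_pct = 100 * negative_count / n_observations
zeros_pct = 100 * n_zeros / n_observations

print(negative_count, f"{negative_pct:.1f}%", f"{zeros_pct:.1f}%")
```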
ValueCountFrequency (%)
6.21
 
0.1%
5.61
 
0.1%
51
 
0.1%
4.22
 
0.2%
47
0.8%
3.81
 
0.1%
3.71
 
0.1%
3.64
0.4%
3.52
 
0.2%
3.42
 
0.2%

num
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.6%
Missing2
Missing (%)0.2%
Memory size7.1 KiB
0.0
404 
1.0
190 
3.0
131 
2.0
130 
4.0
42 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2691
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row3.0
5th row0.0

Common Values

Value      Count  Frequency (%)
0.0        404    44.9%
1.0        190    21.1%
3.0        131    14.6%
2.0        130    14.5%
4.0        42     4.7%
(Missing)  2      0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    404    45.0%
1.0    190    21.2%
3.0    131    14.6%
2.0    130    14.5%
4.0    42     4.7%
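
The two frequency tables for num report slightly different shares for the same counts (44.9% versus 45.0% for the value 0.0). The numbers are consistent with the Common Values table dividing by all 899 rows and the Category Frequency Plot table dividing by the 897 non-missing rows. That reading is inferred from the figures and is not stated by the report. A short check, with illustrative variable names:

```python
n_rows = 899      # Number of observations
n_missing = 2     # Missing values in num
count_zero = 404  # Rows where num is 0.0

# Share of all rows, including the missing ones.
share_of_all_rows = 100 * count_zero / n_rows

# Share of rows that actually have a value (assumed denominator of the plot table).
share_of_non_missing = 100 * count_zero / (n_rows - n_missing)

print(f"{share_of_all_rows:.1f}% of all rows")
print(f"{share_of_non_missing:.1f}% of non-missing rows")
```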

Most occurring characters

Value  Count  Frequency (%)
0      1301   48.3%
.      897    33.3%
1      190    7.1%
3      131    4.9%
2      130    4.8%
4      42     1.6%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1794   66.7%
Other Punctuation  897    33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1301   72.5%
1      190    10.6%
3      131    7.3%
2      130    7.2%
4      42     2.3%

Other Punctuation
Value  Count  Frequency (%)
.      897    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2691   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1301   48.3%
.      897    33.3%
1      190    7.1%
3      131    4.9%
2      130    4.8%
4      42     1.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2691   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      1301   48.3%
.      897    33.3%
1      190    7.1%
3      131    4.9%
2      130    4.8%
4      42     1.6%

dataset
Categorical

HIGH CORRELATION

Distinct      4
Distinct (%)  0.4%
Missing       0
Missing (%)   0.0%
Memory size   7.1 KiB

Value          Count
hungarian      295
cleveland      282
long-beach-va  199
switzerland    123

Length

Max length     13
Median length  9
Mean length    10.15906563
Min length     9
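
The length statistics follow directly from the four category names and their counts. As a rough sketch (the dictionary and variable names are illustrative, not part of the report), the values can be recomputed like this:

```python
from statistics import median

# Category counts as reported for the dataset variable.
counts = {
    "hungarian": 295,
    "cleveland": 282,
    "long-beach-va": 199,
    "switzerland": 123,
}

# One string length per row of the column.
lengths = [len(name) for name, count in counts.items() for _ in range(count)]

total_characters = sum(lengths)
mean_length = total_characters / len(lengths)
median_length = median(lengths)

print(total_characters, mean_length, median_length, min(lengths), max(lengths))
```

The total of the lengths matches the "Total characters" figure the report gives for this variable (9133), and dividing it by the 899 rows gives the reported mean length.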

Characters and Unicode

Total characters     9133
Distinct characters  19
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  hungarian
2nd row  hungarian
3rd row  hungarian
4th row  hungarian
5th row  hungarian

Common Values

Value          Count  Frequency (%)
hungarian      295    32.8%
cleveland      282    31.4%
long-beach-va  199    22.1%
switzerland    123    13.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value          Count  Frequency (%)
hungarian      295    32.8%
cleveland      282    31.4%
long-beach-va  199    22.1%
switzerland    123    13.7%

Most occurring characters

Value             Count  Frequency (%)
a                 1393   15.3%
n                 1194   13.1%
e                 886    9.7%
l                 886    9.7%
h                 494    5.4%
g                 494    5.4%
c                 481    5.3%
v                 481    5.3%
r                 418    4.6%
i                 418    4.6%
Other values (9)  1988   21.8%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  8735   95.6%
Dash Punctuation  398    4.4%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
a                 1393   15.9%
n                 1194   13.7%
e                 886    10.1%
l                 886    10.1%
h                 494    5.7%
g                 494    5.7%
c                 481    5.5%
v                 481    5.5%
r                 418    4.8%
i                 418    4.8%
Other values (8)  1590   18.2%

Dash Punctuation
Value  Count  Frequency (%)
-      398    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   8735   95.6%
Common  398    4.4%

Most frequent character per script

Latin
Value             Count  Frequency (%)
a                 1393   15.9%
n                 1194   13.7%
e                 886    10.1%
l                 886    10.1%
h                 494    5.7%
g                 494    5.7%
c                 481    5.5%
v                 481    5.5%
r                 418    4.8%
i                 418    4.8%
Other values (8)  1590   18.2%

Common
Value  Count  Frequency (%)
-      398    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  9133   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
a                 1393   15.3%
n                 1194   13.1%
e                 886    9.7%
l                 886    9.7%
h                 494    5.4%
g                 494    5.4%
c                 481    5.3%
v                 481    5.3%
r                 418    4.6%
i                 418    4.6%
Other values (9)  1988   21.8%

Interactions

[Interaction plots: 100 scatter-plot panels, images not preserved in this text export]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
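
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is an illustrative sketch, not the code the report uses. The function names and the toy data are assumptions, and ties are given the average of their ranks, which is a common convention.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations (the 1/n factors cancel)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y), pearson(x, y))
```

On this toy data the ranks of x and y coincide, so ρ reaches its maximum, while Pearson's r on the raw values stays below 1 because the relationship is not linear.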

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
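
As a small illustration of this formula and of the invariance property, the sketch below computes r for the thalach and thalrest values of the ten "First rows" shown in the Sample section of this report, then recomputes it after shifting and rescaling each variable. The function name and the particular shift and scale constants are arbitrary choices for the example.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their sample standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


# thalach and thalrest for the first ten rows of the dataset sample.
thalach = [172.0, 156.0, 98.0, 108.0, 122.0, 170.0, 170.0, 142.0, 130.0, 120.0]
thalrest = [86.0, 100.0, 58.0, 54.0, 74.0, 86.0, 90.0, 56.0, 63.0, 72.0]

r = pearson_r(thalach, thalrest)

# Shift and rescale each variable separately, using positive scale factors.
r_transformed = pearson_r(
    [2.0 * v + 3.0 for v in thalach],
    [0.5 * v - 1.0 for v in thalrest],
)

print(r, r_transformed)
```

The two printed values agree up to floating-point error. Ten rows are far too few to say anything about the full dataset; the sample only serves to exercise the formula.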

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
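
The pair-counting definition translates almost literally into code. The sketch below implements exactly the formula stated here, which is usually called tau-a. Library implementations commonly use the tau-b variant, which adds an adjustment for ties, so results can differ when the data contain ties. The function name and the toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1    # both variables move in the same direction
        elif direction < 0:
            discordant += 1    # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    return (concordant - discordant) / len(pairs)


# Toy data: of the 6 pairs, only the pair (2, 3) vs (3, 2) is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With 5 concordant pairs and 1 discordant pair out of 6, the result is (5 - 1) / 6.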

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
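
For orientation, here is a minimal sketch of a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013), computed from a contingency table of observed counts. It is an illustration under stated assumptions, not the report generator's own code: the function name is made up, the table must have at least two rows and two columns, and every row and column total must be positive.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi squared and the effective table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corrected / min(k_corrected - 1, r_corrected - 1))


independent = [[10, 10], [10, 10]]   # rows and columns unrelated
associated = [[20, 0], [0, 20]]      # row fully determines column
print(cramers_v_corrected(independent), cramers_v_corrected(associated))
```

The two toy tables sit at the ends of the scale: the first yields 0 (independence) and the second yields 1 (perfect association), up to floating-point error.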

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
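
One way to make nullity correlation concrete is to correlate the "is missing" indicators of two columns. For two 0/1 indicators, Pearson's r reduces to the phi coefficient of their 2x2 count table, which the sketch below computes. The function name is illustrative and `None` stands in for NaN. The three columns are age, trestbps and thaldur from the ten "Last rows" shown in the Sample section of this report.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Phi coefficient between the missingness indicators of two equally long columns."""
    both = sum(a is None and b is None for a, b in zip(col_a, col_b))
    only_a = sum(a is None and b is not None for a, b in zip(col_a, col_b))
    only_b = sum(a is not None and b is None for a, b in zip(col_a, col_b))
    neither = sum(a is not None and b is not None for a, b in zip(col_a, col_b))

    missing_a, present_a = both + only_a, only_b + neither
    missing_b, present_b = both + only_b, only_a + neither
    denominator = sqrt(missing_a * present_a * missing_b * present_b)
    if denominator == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return (both * neither - only_a * only_b) / denominator


# Values from the last ten rows of the dataset sample (None marks NaN).
age = [51.0, 62.0, 53.0, 46.0, 54.0, 62.0, 55.0, 58.0, 62.0, None]
trestbps = [114.0, 160.0, 144.0, 134.0, 127.0, None, 122.0, None, 120.0, None]
thaldur = [4.0, 3.5, 4.0, 5.5, 7.5, None, 5.3, None, 6.7, None]

print(nullity_correlation(trestbps, thaldur))
print(nullity_correlation(trestbps, age))
```

In these ten rows trestbps and thaldur are missing in exactly the same rows, so their nullity correlation is 1. The value for trestbps and age is lower because age is missing in only one of the three rows where trestbps is missing.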
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

     df_index   age  sex   cp  trestbps  restecg  dig  prop  nitr  pro  diuretic  thaldur  thalach  thalrest  tpeakbps  tpeakbpd  trestbpd  exang  xhypo  oldpeak  num  dataset
  0         0  40.0  1.0  2.0     140.0      0.0  0.0   0.0   0.0  0.0       0.0     18.0    172.0      86.0     200.0     110.0      86.0    0.0    0.0      0.0  0.0  hungarian
  1         1  49.0  0.0  3.0     160.0      0.0  0.0   0.0   0.0  0.0       0.0     10.0    156.0     100.0     220.0     106.0      90.0    0.0    0.0      1.0  1.0  hungarian
  2         2  37.0  1.0  2.0     130.0      1.0  0.0   0.0   0.0  0.0       0.0     10.0     98.0      58.0     180.0     100.0      80.0    0.0    0.0      0.0  0.0  hungarian
  3         3  48.0  0.0  4.0     138.0      0.0  0.0   0.0   0.0  0.0       0.0      5.0    108.0      54.0     210.0     106.0      86.0    1.0    0.0      1.5  3.0  hungarian
  4         4  54.0  1.0  3.0     150.0      0.0  0.0   0.0   1.0  1.0       0.0      2.0    122.0      74.0     130.0     100.0      90.0    0.0    1.0      0.0  0.0  hungarian
  5         5  39.0  1.0  3.0     120.0      0.0  0.0   0.0   0.0  0.0       0.0     19.0    170.0      86.0     198.0     100.0      80.0    0.0    0.0      0.0  0.0  hungarian
  6         6  45.0  0.0  2.0     130.0      0.0  0.0   0.0   0.0  0.0       0.0     10.0    170.0      90.0     200.0     106.0      84.0    0.0    0.0      0.0  0.0  hungarian
  7         7  54.0  1.0  2.0     110.0      0.0  0.0   0.0   0.0  0.0       0.0     19.0    142.0      56.0     220.0      70.0      70.0    0.0    0.0      0.0  0.0  hungarian
  8         8  37.0  1.0  4.0     140.0      0.0  0.0   0.0   0.0  0.0       0.0     15.0    130.0      63.0     190.0     100.0      80.0    1.0    0.0      1.5  1.0  hungarian
  9         9  48.0  0.0  2.0     120.0      0.0  0.0   0.0   0.0  0.0       0.0      7.0    120.0      72.0     140.0      80.0      80.0    0.0    0.0      0.0  0.0  hungarian

Last rows

     df_index   age  sex   cp  trestbps  restecg  dig  prop  nitr  pro  diuretic  thaldur  thalach  thalrest  tpeakbps  tpeakbpd  trestbpd  exang  xhypo  oldpeak  num  dataset
889       890  51.0  0.0  4.0     114.0      2.0  0.0   1.0   0.0  0.0       0.0      4.0     96.0      52.0     140.0      96.0      74.0    0.0    0.0      1.0  0.0  long-beach-va
890       891  62.0  1.0  4.0     160.0      1.0  1.0   0.0   1.0  1.0       1.0      3.5    108.0      69.0     160.0      90.0      80.0    1.0    0.0      3.0  4.0  long-beach-va
891       892  53.0  1.0  4.0     144.0      1.0  0.0   0.0   1.0  0.0       0.0      4.0    128.0      76.0     150.0     102.0      94.0    1.0    0.0      1.5  3.0  long-beach-va
892       894  46.0  1.0  4.0     134.0      0.0  0.0   0.0   0.0  0.0       0.0      5.5    126.0      88.0     174.0     114.0      90.0    0.0    0.0      0.0  2.0  long-beach-va
893       895  54.0  0.0  4.0     127.0      1.0  0.0   1.0   1.0  0.0       0.0      7.5    154.0      83.0     158.0      84.0      78.0    0.0    0.0      0.0  1.0  long-beach-va
894       896  62.0  1.0  1.0       NaN      1.0  NaN   NaN   NaN  NaN       NaN      NaN      NaN       NaN       NaN       NaN       NaN    NaN    NaN      NaN  0.0  long-beach-va
895       897  55.0  1.0  4.0     122.0      1.0  0.0   1.0   1.0  0.0       1.0      5.3    100.0      74.0     210.0     100.0      70.0    0.0    0.0      0.0  2.0  long-beach-va
896       898  58.0  1.0  4.0       NaN      2.0  NaN   NaN   NaN  NaN       NaN      NaN      NaN       NaN       NaN       NaN       NaN    NaN    NaN      NaN  0.0  long-beach-va
897       899  62.0  1.0  2.0     120.0      2.0  0.0   1.0   0.0  0.0       0.0      6.7     93.0      67.0     164.0     110.0      80.0    1.0    0.0      0.0  1.0  long-beach-va
898       900   NaN  NaN  NaN       NaN      NaN  NaN   NaN   NaN  NaN       NaN      NaN      NaN       NaN       NaN       NaN       NaN    NaN    NaN      NaN  NaN  long-beach-va